Molecular determinants of acrylamide neurotoxicity through covalent docking

Acrylamide (ACR) is formed during food processing by the Maillard reaction between sugars and proteins at high temperatures. It is also used in many industries, from wastewater treatment to the manufacture of paper, fabrics, dyes and cosmetics. Unfortunately, cumulative exposure to acrylamide, either from diet or at the workplace, may result in neurotoxicity. Such adverse effects arise from covalent adducts formed between acrylamide and cysteine residues of several neuronal proteins via a Michael addition reaction. The molecular determinants of acrylamide reactivity and its impact on protein function are not completely understood. Here we have compiled a list of the acrylamide protein targets reported so far in the literature in connection with neurotoxicity and performed a systematic covalent docking study. Our results indicate that acrylamide binding to cysteine is favored in the presence of nearby positively charged amino acids, such as lysines and arginines. For proteins with more than one reactive Cys, docking scores were able to discriminate between the primary ACR modification site and secondary sites modified only at high ACR concentrations. Therefore, docking scores emerge as a potential filter to predict Cys reactivity against acrylamide. Inspection of the ACR-protein complex structures provides insights into the putative functional consequences of ACR modification, especially for non-enzyme proteins. Based on our study, covalent docking is a promising computational tool to predict other potential protein targets mediating acrylamide neurotoxicity.

For each protein target, we used the structures listed in Table S1 to analyze the physicochemical properties and location of each candidate Cys residue, as well as its microenvironment, as explained in the main text (see section 2.4). Most of the cysteines in our dataset are located in enzyme active sites (see Table S1). This most likely reflects the more readily available purification methods and functional assays for enzymes compared to other protein functional classes (see also below). Cys residues with higher SASA and lower pKa values in Table S1 are more accessible and more acidic, and thus potentially more reactive. However, SASA and pKa calculations can have limited accuracy, owing to their dependence on the structure used to represent the protein and to the poor performance of pKa predictors for Cys (Awoonor-Williams and Rowley, 2016). Hence, we also inspected residues in the vicinity of the candidate cysteines that could favor Cys deprotonation (step 2 in Figure 1 in the main text). His and Asp/Glu (in green and red, respectively, in Table S2) could deprotonate the Cys thiol group, whereas Arg/Lys (in blue) would stabilize the resulting thiolate. In addition, other residues capable of H-bonding (in orange in Table S2) could also help make the Cys more acidic (Roos et al., 2013).
Figure S1: Ramachandran plots of the homology models built for the dopamine transporter, outward and inward conformations, as well as the NEM-sensitive factor (from left to right). The plots were generated using the RamachanDraw tool (https://pypi.org/project/RamachanDraw/), distributed under the MIT license.
Figure S2: Local quality values of the homology models built for the dopamine transporter, outward and inward conformations, and the NEM-sensitive factor (from top to bottom). The plots were generated with the QMEAN webserver (https://swissmodel.expasy.org/qmean/).
Table S1: Acrylamide protein targets compiled in this study. Protein structures and cysteine residues considered for the covalent docking calculations are listed. Targets for which homology models of the human proteins had to be generated are indicated by HM, with the template structure given in parentheses; further details are provided in the main text (sections 3.2.4 and 3.2.7). Physicochemical characterization and location of the candidate cysteines are also included; n.a. indicates cysteines for which no functionally relevant location was identified. Residue numbers highlighted in bold indicate the main reactive cysteine in protein targets with experimental data, whereas an asterisk marks residues predicted to be targeted by acrylamide based on the results of our covalent docking calculations.
[Table S1 body not recovered from the extracted text. Surviving fragments: 10.2, allosteric site (Noort and Hulst, 2003; Tong et al., 2004; Ferguson et al., 2010); n.a. (Hansch et al., 1986; Dixit et al., 1981); 240*, 30.8, n.a. (Dobryszycki et al., 1999a,b); n.a. (Matsuoka et al., 1996; Lin et al., 1994; Meng et al., 2001; Sheng et al., 2009); n.a. (Howland et al., 1980).]
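The microenvironment inspection described above (His, Asp/Glu, Arg/Lys and other H-bonding residues near a candidate Cys) can be illustrated with a minimal sketch. It is not the procedure used in this study: the 5 Å cutoff, the input format, the function name `classify_microenvironment` and the membership of the "other H-bonding" class are all assumptions made for illustration.

```python
import math

# Residue classes follow the colour scheme described for Table S2.
# The membership of the "other H-bonding" class is an assumption.
RESIDUE_CLASS = {
    "HIS": "His",
    "ASP": "Asp/Glu",
    "GLU": "Asp/Glu",
    "ARG": "Arg/Lys",
    "LYS": "Arg/Lys",
    "SER": "other H-bonding",
    "THR": "other H-bonding",
    "TYR": "other H-bonding",
    "ASN": "other H-bonding",
    "GLN": "other H-bonding",
}


def classify_microenvironment(sg_xyz, atoms, cutoff=5.0):
    """Group residues with any atom within `cutoff` (in Å) of the Cys sulfur.

    atoms: iterable of (residue_name, residue_number, (x, y, z)).
    Returns {class label: sorted list of (residue_name, residue_number)}.
    """
    found = {}
    for resname, resid, xyz in atoms:
        label = RESIDUE_CLASS.get(resname)
        if label is not None and math.dist(sg_xyz, xyz) <= cutoff:
            found.setdefault(label, set()).add((resname, resid))
    return {label: sorted(members) for label, members in found.items()}
```

In practice the atom list would be read from the structures in Table S1 with a structure parser; plain tuples are used here to keep the sketch self-contained.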

Validation of the covalent docking approach
Covalent docking to the reactive cysteine of each target protein (see Table S1) was performed using Haddock (version 2.2) (De Vries et al., 2010; van Zundert et al., 2016). We followed the standard covalent docking protocol of Haddock, which was initially developed for covalent inhibitors of cathepsin K (HADDOCK developer team, 2018). Here we validated it for Cys-ACR adducts using the available experimental structures of such covalent ligand-protein complexes. In particular, we searched the Protein Data Bank (Berman et al., 2000; Burley et al., 2021) for experimental structures containing the ligand name ROP (i.e. propionamide, the product of the Michael addition reaction, as explained in the main text, section 2.3.1); see (RCSB PDB Ligand Summary ROP, 2022). We then filtered for entries in which this ligand is covalently linked to the protein, yielding a total of five X-ray structures (PDB codes 3ZVI (Raj et al., 2012), 4GYL (Weber et al., 2013), 4IZV, 4IZU, and 4WGF (Groftehauge et al., 2015)), of which one (PDB 4IZU) contains two covalent ACR adducts (at C53 and C145). All these structures correspond to bacterial enzymes and have resolutions between 1.4 Å and 2.3 Å. For the redocking calculations, the ROP ligand was removed from the protein structures, which were then submitted to the same covalent docking protocol described in the main text (see section 2.3.2). The redocking and crystallographic poses are compared in Table S4 in terms of their protein-ligand interactions. Most of the protein-ligand interactions observed in the crystal structures are reproduced in the redocking poses. The redocking poses also show additional interactions, which we ascribe to the increased flexibility of acrylamide in the redocking calculations (which include a final molecular dynamics refinement step at 300 K; see step 3b in section 2.3.2 in the main text) compared to the X-ray structures (solved at 100-110 K).
For further information on the redocking calculations, we refer the reader to Supplementary Material 2, which includes the Haddock score and size of the redocking clusters, as well as the Haddock score of the top four poses of each cluster.
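The comparison of crystallographic and redocking poses in terms of their protein-ligand interactions can be expressed as a simple set comparison. The sketch below is only an illustration of that idea: the function name `interaction_recovery` and the representation of an interaction as an (interaction type, residue) pair are assumptions, not the format of Table S4.

```python
def interaction_recovery(crystal, redocked):
    """Compare two collections of (interaction_type, residue) pairs.

    Returns the fraction of crystallographic interactions reproduced by the
    redocking pose, plus the interactions missing from and added by redocking.
    """
    crystal, redocked = set(crystal), set(redocked)
    shared = crystal & redocked
    return {
        # None when the crystal structure shows no interactions at all
        "recovered_fraction": len(shared) / len(crystal) if crystal else None,
        "missing": sorted(crystal - redocked),
        "additional": sorted(redocked - crystal),
    }
```

A recovered fraction close to 1 together with a non-empty "additional" list would correspond to the situation described above, in which the crystallographic interactions are reproduced and the more flexible redocked ligand forms further contacts.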

Proteins with experimentally verified reactive cysteine
Out of the 19 proteins in our dataset, the specific Cys targeted by acrylamide is known for eight. Below we present the results of the covalent docking performed for each of these proteins. In some cases (creatine kinase, glyceraldehyde-3-phosphate dehydrogenase and hemoglobin) experimental evidence suggests more than one cysteine targeted by acrylamide, but with different reactivity; thus we performed a docking calculation for each of the cysteines within the same protein target separately.
The outcome of these docking calculations is presented partly in the main text and partly below. The main text includes an overview of the results and their discussion, as well as Table 1, which reports the Haddock score and cluster size of the top (best scored) cluster. Below we show the protein-ligand interaction analysis of the docking poses of the top cluster; additional clusters are also considered if their Haddock scores fall within the standard deviation of the top cluster. This analysis was carried out with ProLIF (Bouysset and Fiorucci, 2021), as explained in section 2.4 of the main text; if no interactions were detected, no scheme is shown. For further details, we refer the reader to Supplementary Material 2, which includes the full report of the docking results.
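The cluster selection rule stated above can be sketched as follows. The exact criterion is our interpretation: a cluster is kept when its mean score lies no more than one standard deviation of the top cluster above the top cluster's mean score. The function name `clusters_to_analyse` and the input layout are hypothetical.

```python
def clusters_to_analyse(clusters):
    """Select the top cluster and those scoring within its standard deviation.

    clusters: {cluster_id: (mean_score, standard_deviation)}; lower (more
    negative) Haddock scores are more favorable.
    Returns the selected cluster ids, best score first.
    """
    top_id = min(clusters, key=lambda cid: clusters[cid][0])
    top_mean, top_sd = clusters[top_id]
    # Assumed criterion: mean score within one top-cluster SD of the top mean
    selected = [cid for cid, (mean, _sd) in clusters.items()
                if mean - top_mean <= top_sd]
    return sorted(selected, key=lambda cid: clusters[cid][0])
```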

Creatine Kinase (CK)
Because of the biphasic, time-dependent inactivation of CK by ACR observed in enzymatic assays (Sheng et al., 2009), we performed covalent docking for several cysteine residues (see Table S1). Covalent docking for the experimentally known primary site of ACR modification in CK, C283, resulted in one main cluster (number 1); the top pose is shown in Figure 4B in the main text and the corresponding protein-ligand interaction fingerprints in Figure S11 below. One main cluster (number 1) was also obtained for the secondary ACR binding site (C141) predicted in this study (see Figures S6 and S7). The docking results for the alternative cysteine discussed in the main text (C146, see section 3.2.2) are shown in Figures S6 and S8-S10. For both C141 and C146, some docking poses showed a distance between ACR and the corresponding Cys that was too long to be compatible with a covalent bond. Such non-covalently bound ligand poses were excluded when calculating the average Haddock score and cluster size shown in Table 1 in the main text and the protein-ligand interaction fingerprints in Figures S7 and S8-S10.
Figure S6: Representative covalent binding poses of ACR and CK C141 (left) and C146 (right). Acrylamide and its surrounding residues are represented as sticks, with carbon atoms colored in green and cyan, respectively. The sulfur atom between the reactive cysteine residue and the adduct is shown as a sphere. Residues forming hydrogen bonds (HBs) with ACR are displayed with thicker sticks and labeled. HBs present in more than 60% of the docking poses are shown with a dashed line. Residues within 5 Å of the adduct that can potentially favor the Michael addition reaction are shown with thinner lines.
In addition, we performed a multibody docking to investigate the possibility of simultaneous binding of two ACR molecules to C74 and C283, as proposed by Sheng et al. (2009).
A representative docking pose is displayed in Figure S12, showing that binding of the first ACR molecule to C283 blocks access to C74. Therefore, in our calculations, C74 cannot be modified by ACR, in contrast to the model proposed by Sheng et al. (2009).
Figure S12: Representative pose of the multibody docking for creatine kinase. ACR is shown as spheres, C283 and C74 in stick representation and the surface of CK is shown in grey. A molecule of ACR is covalently bound to C283; as a result, C74 is no longer accessible to the second ACR molecule coming from the solution. Indeed, the covalent docking calculation placed the second molecule of ACR (not shown) outside the cavity where C74 and C283 are located.
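The exclusion of non-covalently bound poses before averaging, described above for C141 and C146, can be sketched as a distance filter. This is an illustration only: the 2.0 Å cutoff is an assumed value (a C-S single bond is about 1.8 Å long), and the pose dictionary keys `sg`, `c_adduct` and `score` are hypothetical names, not Haddock output fields.

```python
import math
from statistics import mean


def covalent_poses(poses, max_cs_distance=2.0):
    """Keep poses whose Cys sulfur to ACR carbon distance allows a C-S bond.

    poses: list of dicts with 'sg' and 'c_adduct' (x, y, z) coordinates in Å
    and a 'score'. The 2.0 Å default cutoff is an assumption.
    """
    return [p for p in poses
            if math.dist(p["sg"], p["c_adduct"]) <= max_cs_distance]


def average_score(poses, max_cs_distance=2.0):
    """Return (mean score, number of poses) over the covalently bound poses."""
    kept = covalent_poses(poses, max_cs_distance)
    if not kept:
        return None, 0
    return mean(p["score"] for p in kept), len(kept)
```

Applying the filter before computing the average and the cluster size keeps both quantities restricted to poses that represent an actual adduct.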

Dopamine Transporter (DAT)
Covalent docking for C342 of DAT (i.e. the primary site of ACR modification for the wild-type transporter) was performed for both the inward and outward conformations (IF and OF, respectively). One main top cluster (number 1) was obtained for the IF state, but two (clusters 1 and 4) for the OF state.

Glyceraldehyde-3-phosphate dehydrogenase (GAPDH)
For GAPDH, covalent docking was performed for three cysteines: C152, C156 and C247, which can be modified at increasing ACR concentrations (Martyniuk et al., 2011). Docking with C152 resulted in one top cluster (number 1); the top pose is shown in the main text (Figure 4F) and the corresponding protein-ligand fingerprint in Figure S18. For C156 and C247, the generated docking poses were more diverse, resulting in two clusters (numbers 1 and 2) and five clusters (numbers 1-5), respectively, with Haddock scores within the standard deviation of the top (best scored) cluster. The top docking poses are shown in Figure S17, and the protein-ligand fingerprints of the corresponding dockings are presented in Figure S19 for C156 and Figure S20 for C247. As explained for CK (see section 1.3.2), docking poses with a distance between ACR and the corresponding Cys too long to be compatible with a C-S covalent bond were excluded from the analysis.
Figure S17: Representative covalent binding poses of ACR and GAPDH C156 (left) and C247 (right). Acrylamide and its surrounding residues are represented as sticks, with carbon atoms colored in green and cyan, respectively. The sulfur atom between the reactive cysteine residue and the adduct is shown as a sphere. Residues forming hydrogen bonds (HBs) with ACR are displayed with thicker sticks and labeled. HBs present in more than 60% of the docking poses are shown with a dashed line. Residues within 5 Å of the adduct that can potentially favor the Michael addition reaction are shown with thinner lines.
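The 60% criterion used in the figure captions to highlight persistent hydrogen bonds amounts to counting, for each residue, the fraction of docking poses in which it forms an HB with ACR. The sketch below assumes that the per-pose HB partners have already been extracted (for instance from an interaction fingerprint); the function name `persistent_hbonds` and the input format are hypothetical.

```python
from collections import Counter


def persistent_hbonds(pose_hbonds, min_fraction=0.6):
    """Return {residue: fraction of poses} for HBs present in more than
    `min_fraction` of the poses.

    pose_hbonds: list with one collection of residue labels per docking pose.
    """
    n_poses = len(pose_hbonds)
    if n_poses == 0:
        return {}
    # set() ensures a residue is counted at most once per pose
    counts = Counter(res for hbonds in pose_hbonds for res in set(hbonds))
    return {res: count / n_poses for res, count in counts.items()
            if count / n_poses > min_fraction}
```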

Hemoglobin (Hb)
Covalent docking for Hb was performed for two cysteines. For C104 in the α subunit, the resulting poses did not exhibit a properly formed C-S bond, suggesting that adduct formation is less favorable for this cysteine. In contrast, for Hb C93 (β subunit), five main clusters (numbers 1-5) were obtained with Haddock scores within the standard deviation of the top (best scored) cluster; their protein-ligand interaction profiles are reported below.

Vesicular proton ATPase (v-ATPase)
Covalent docking for C254 of v-ATPase yielded one main top cluster (number 1); the analysis of the protein-ligand interactions present in the docking poses belonging to that cluster is shown below. As mentioned in the main text, the (only slightly) favorable docking score of this top cluster (−0.1 a.u.) can be attributed to C254 being located in a loop segment of the Walker A motif (GAFGCGKT). This motif is involved in coordinating ATP binding and hydrolysis and hence exhibits large rearrangements during the v-ATPase conformational cycle (see Figure S26). Such structural changes cannot be sampled with covalent docking protocols, thus resulting in lower accuracy of the predicted docking poses.

Proteins without reactive cysteine experimental information: Selection of candidate residues
Out of the 19 proteins in our dataset (Table S1), the specific Cys targeted by acrylamide is not known for eleven (see below; proteins are listed in alphabetical order). To narrow down the most promising candidates for the reactive Cys, we first checked physicochemical properties and microenvironment, as well as conservation and post-translational modifications, for all the cysteines of the respective protein target (see section 2.2 in the main text). Below we present the outcome of these analyses, as well as the main results of the covalent docking calculations performed for each of the selected candidate Cys residues. For further details, we refer the reader to Supplementary Material 2, which includes the full report of the docking results. Here we show the protein-ligand interaction analysis of the docking poses of the top (best scored) cluster, as well as of additional clusters when their Haddock scores fall within the standard deviation of the top cluster. If no interactions were detected by ProLIF (Bouysset and Fiorucci, 2021), no scheme is shown.

Alcohol Dehydrogenase (ADH)
Alcohol dehydrogenase is an enzyme of ethanol metabolism that converts alcohols to aldehydes. ADH has multiple isoforms in humans, which can be grouped into different sub-classes. The crystal structure with the highest resolution belongs to ADH1C, also known as ADH3 (see Table S1). Hence, the subsequent analyses were performed with this isoform.
An MSA of the seven human ADH sequences (Figure S29) revealed eleven conserved Cys. However, six of them are part of the two Zn2+ binding sites present in ADHs (Eklund et al., 1976), and thus were discarded as potential ACR binding sites. Out of the remaining five conserved cysteines, we selected C170 and C240 (ADH1C numbering) for our covalent docking tests. C170 is located near the enzyme active site and thus its modification is more likely to have an impact on protein function. In addition, C240 in ADH appears to be the residue equivalent to C247 in GAPDH; the latter has been shown experimentally to be a target of ACR, though at high concentrations.
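Identifying conserved cysteines in an MSA, as done above for the ADH isoforms, reduces to scanning the alignment columns. The following minimal sketch assumes the alignment is available as equal-length gapped strings (for example parsed from the MAFFT output); the function name `conserved_cys_columns` and the toy sequences in the usage note are illustrative and not real ADH sequences.

```python
def conserved_cys_columns(alignment):
    """Find alignment columns in which every sequence has a cysteine.

    alignment: {sequence_name: aligned sequence with '-' for gaps}.
    Returns a list of (column_index, {sequence_name: residue_number}), where
    residue numbers are 1-based positions in the ungapped sequences.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    position = dict.fromkeys(names, 0)  # running ungapped residue numbers
    hits = []
    for col in range(length):
        for name in names:
            if alignment[name][col] != "-":
                position[name] += 1
        if all(alignment[name][col] == "C" for name in names):
            hits.append((col, dict(position)))
    return hits
```

For a toy alignment such as `{"isoA": "MC-AC", "isoB": "MCGAC", "isoC": "MSGAC"}`, only the last column is reported, with residue numbers 4, 5 and 5, which shows why per-isoform numbering has to be tracked separately from the column index.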
The results of the covalent docking for each of these two cysteines, using the aforementioned ADH structure, are reported below and in Supplementary Material 2. Based on the more favorable docking score for C240 (−10.8 a.u.) compared to C170 (−4.2 a.u.), we suggest that C240 might be the primary site of ACR modification in ADH.
Figure S29: Multiple sequence alignment of human alcohol dehydrogenase isoforms, performed with the MAFFT webserver (Katoh et al., 2022). The Clustal color code was used, with hydrophobic residues in blue, positively charged in red, negatively charged in magenta, polar in green and aromatic in cyan. Special residues are shown in orange (glycine), yellow (proline) and pink (cysteine), and non-conserved residues are in white.

Aldolase
Aldolase (or fructose-bisphosphate aldolase) is a key enzyme of the glycolysis pathway. It is responsible for splitting fructose 1,6-bisphosphate into dihydroxyacetone phosphate (DHAP) and glyceraldehyde 3-phosphate (GADP). Aldolase has three isoforms: aldolase A (expressed in muscle), aldolase B (liver) and aldolase C (brain). In vitro experiments on rabbit muscle aldolase showed a multiphasic inactivation of aldolase by acrylamide (Dobryszycki et al., 1999a); however, there are no mutagenesis data pinpointing the Cys residues targeted by ACR. Hence, we first checked cysteine conservation by generating a multiple sequence alignment of human and rabbit aldolase isoforms (Figure S33). We found that C134, C177, C239 and C289 are conserved and thus are potential candidates for the reactive Cys targeted by ACR and/or may have functional relevance.
Next, we used the crystal structure of human liver aldolase (i.e. the highest resolution crystal structure) to perform ACR covalent dockings for each of these cysteines. As shown in Figure S33, this isoform possesses an additional cysteine residue at position 268, which is solvent exposed and thus was also submitted to our docking workflow. As mentioned above, experimental studies showed that aldolase inactivation depends on both ACR concentration and time of incubation (Dobryszycki et al., 1999a). This rather complex pattern of inactivation indicates that acrylamide could form multiple covalent adducts at several Cys sites before aldolase activity is completely abolished. Therefore, in this case it is particularly relevant to consider all possible Cys candidates for covalent docking.
Figure S33: Multiple sequence alignment of human and rabbit aldolase isoforms, performed with the MAFFT webserver (Katoh et al., 2022). Aldolase isoforms A (muscle), B (liver) and C (brain) were included. The same color code as in Figure S29 was used.
Our computational results showed that C134, C239, C268 and C289 exhibit similarly favorable docking scores for their top clusters, i.e. −26.1 a.u., −23.1 a.u., −29.4 a.u. and −24.1 a.u., respectively (see Supplementary Material 2). Below we report the protein-ligand interactions for each of the Cys-ACR covalent adducts. In contrast, the less favorable docking score for C177 (−9.7 a.u.), as well as the lack of a properly formed S-C bond between C177 and ACR in our models, suggests that modification of this particular residue is less likely.
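Using the top-cluster scores quoted above, the use of the docking score as a filter can be sketched as follows. The scores are those reported in the text; the `bond_formed` flag, the function name `rank_candidates` and the decision to drop candidates without a formed bond before ranking are our illustrative reading, not a procedure stated in this study.

```python
# Top-cluster Haddock scores (a.u.) quoted in the text for human liver aldolase.
# bond_formed is False only for C177, for which the text reports no properly
# formed S-C bond; True for the other residues is our reading of the text.
candidates = {
    "C134": {"score": -26.1, "bond_formed": True},
    "C177": {"score": -9.7, "bond_formed": False},
    "C239": {"score": -23.1, "bond_formed": True},
    "C268": {"score": -29.4, "bond_formed": True},
    "C289": {"score": -24.1, "bond_formed": True},
}


def rank_candidates(candidates):
    """Drop candidates without a formed C-S bond and sort the rest by score
    (most negative, i.e. most favorable, first)."""
    viable = {cys: d for cys, d in candidates.items() if d["bond_formed"]}
    return sorted(viable, key=lambda cys: viable[cys]["score"])


ranking = rank_candidates(candidates)
```

Since the four remaining scores are similar, the resulting order should not be over-interpreted; the filter mainly separates C177 from the other candidates.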

Enolase
Like aldolase and GAPDH, enolase is an enzyme of the glycolysis pathway and is responsible for the conversion of 2-phosphoglycerate to phosphoenolpyruvate. Different isoforms of enolase exist, assembled as either homo- or heterodimers.
To pinpoint possible candidates for subsequent covalent docking, we first identified solvent-exposed Cys residues. Out of the six conserved cysteines present in enolase (C118, C336, C338, C356, C388 and C398, see Figure S38), only the last two are significantly solvent exposed and thus accessible to acrylamide. Hence, C388 and C398 were considered for the subsequent covalent docking calculations.
Based on the docking scores obtained (−36.3 and −39.0 a.u. for C388 and C398, respectively), we suggest that both cysteines can potentially be targeted by acrylamide. Modification of C388 by ACR is supported by experimental studies showing that this cysteine undergoes modification under electrophile-induced oxidative stress, resulting in enzyme inactivation (Ishii and Uchida, 2004). In addition, ACR modification of C398 could further contribute to the experimentally observed protein inhibition (Howland et al., 1980) by altering the quaternary structure of the enzyme, as C398 is located at the interface between two enolase subunits. This example highlights the importance of mapping the location of the candidate reactive Cys on the protein 3D structure in order to establish a connection between ACR modification and its impact on protein function.
Figure S38: Multiple sequence alignment of human enolase isoforms (α, β and γ) generated using the MAFFT webserver (Katoh et al., 2022). The same color code as in Figure S29 was used.

Estrogen Receptor
Estrogen receptors are nuclear receptors initially identified as binding female steroids of the estrogen group and regulating pathways related to sexual maturity. Alterations of this tightly regulated hormonal balance can lead to pathological side effects, such as tumor formation.
Estrogen receptors have been proposed as potential targets of acrylamide based on the experimental observation that acrylamide induces misregulation of the hormone balance (Hogervorst et al., 2009). Moreover, human estrogen receptors contain conserved cysteine-rich binding domains that can act as reactive sites for ACR. In this regard, mutagenesis data has shown C381 and C530 to be involved in covalent hormone binding (Aliau et al., 1999) and thus these two cysteines could potentially form covalent adducts also with acrylamide.
Both residues showed favorable docking scores for their top cluster, −36.0 and −22.6 a.u. for C381 and C530, respectively. Considering the more favorable docking score for C381, the presence of more protein-ligand HBs and a lower pKa value of 9.9 (compared to >12 for C530), it is reasonable to assume that C381 is the primary site of ACR modification. However, we cannot rule out that C530 could still be modified by acrylamide at higher concentrations.
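To make the pKa argument more concrete, the fraction of a cysteine present as the reactive thiolate can be estimated with the Henderson-Hasselbalch relation. The sketch below is a back-of-the-envelope illustration only: the pH of 7.4 is an assumed value, the function name `thiolate_fraction` is hypothetical, and, as noted earlier, predicted Cys pKa values have limited accuracy.

```python
def thiolate_fraction(pka, ph=7.4):
    """Fraction of a thiol in the deprotonated (thiolate) form at a given pH,
    from the Henderson-Hasselbalch relation: 1 / (1 + 10**(pKa - pH))."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


# pKa values quoted in the text; 12.0 is used as the lower bound of ">12",
# so the C530 value is an upper estimate of its thiolate fraction.
f_c381 = thiolate_fraction(9.9)
f_c530_upper = thiolate_fraction(12.0)
```

Under these assumptions both fractions are small, but the one for C381 is roughly two orders of magnitude larger than the upper estimate for C530, in line with the ranking suggested by the docking scores.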

Immunoglobulins (Igs) G1 H Nie and kappa light chain
Acrylamide modification of two immunoglobulins was identified through nano liquid chromatography combined with tandem mass spectrometry (Feng and Lu, 2011b). Mapping of the modified peptides against the corresponding full protein sequences pinpointed C395 (immunoglobulin G1 H Nie) and C134 (immunoglobulin kappa light chain). Inspection of the respective protein structures (see Table S1) showed that both residues are involved in disulfide bridges. However, free thiols and chemical modification of disulfide bridges have been detected in other Igs (Liu and May, 2012). Therefore, we performed covalent dockings for each of these cysteines, assuming that either they are reversibly reduced under certain conditions or acrylamide is able to break and modify the corresponding disulfide bridge.
As shown in Supplementary Material 2, docking to C395 of immunoglobulin G1 H Nie showed one main cluster (number 1), whereas docking to C134 of the immunoglobulin kappa light chain yielded four clusters with Haddock scores within the standard deviation of the top (best scored) cluster. Below we report the protein-ligand interactions of those clusters for which HBs were detected. We speculate that ACR modification of the aforementioned cysteines will affect the structure and stability of these two Igs, since it removes disulfide bridges.

Kinesins
Kinesins are motor proteins essential for transporting cellular cargos along microtubules, powered by ATP hydrolysis. ACR-mediated inhibition of kinesins has been proposed as the molecular mechanism by which acrylamide impairs fast anterograde and retrograde axonal transport (Erkekoglu and Baydar, 2014). Indeed, experimental studies have confirmed that acrylamide is able to inhibit two kinesin motor proteins (Sickles et al., 1996, 2007; Friedman et al., 2008), namely KIFC5A and KRP2 (later renamed as KIFC1/HSET and KIF2C) (Miki et al., 2001). To pinpoint candidate reactive Cys, we first generated a multiple sequence alignment of kinesin sequences homologous to KIFC1/HSET and KIF2C, including human and rat isoforms belonging to kinesin families 14 and 13, respectively (Figure S47). Notably, the sensitivity to acrylamide is much higher for KIFC5A/KIF2C than for KRP2/KIFC1/HSET (Friedman et al., 2008). Therefore, in this case we focused on cysteine residues that are conserved in one kinesin family but not the other.
Figure S47: Multiple sequence alignment of rat and human isoforms homologous to the kinesins KIFC1/HSET and KIF2C studied here, belonging to the kinesin 14 and 13 families, respectively. The MSA was generated using the MAFFT webserver (Katoh et al., 2022) and the same color code as in Figure S29 was used.
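The search for cysteines conserved in one kinesin family but not the other can be sketched as a column scan over a joint alignment. This is a minimal illustration with a hypothetical function name (`family_specific_cys`) and toy sequences; it interprets "conserved" as "Cys in every sequence of the family", which is an assumption.

```python
def family_specific_cys(family_a, family_b):
    """Find alignment columns where Cys is conserved in only one family.

    family_a, family_b: {sequence_name: aligned sequence}, all taken from the
    same alignment (equal lengths, '-' for gaps).
    Returns (columns conserved only in A, columns conserved only in B).
    """
    seqs_a, seqs_b = list(family_a.values()), list(family_b.values())
    length = len(seqs_a[0])
    if any(len(seq) != length for seq in seqs_a + seqs_b):
        raise ValueError("all aligned sequences must have the same length")
    only_a, only_b = [], []
    for col in range(length):
        a_conserved = all(seq[col] == "C" for seq in seqs_a)
        b_conserved = all(seq[col] == "C" for seq in seqs_b)
        if a_conserved and not b_conserved:
            only_a.append(col)
        elif b_conserved and not a_conserved:
            only_b.append(col)
    return only_a, only_b
```

Columns in which Cys is conserved in both families are deliberately not reported, since they cannot explain a difference in acrylamide sensitivity between the families.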
For KIF2C, we focused on the kinesin ATPase motor domain, as ACR-mediated inhibition of such catalytic domains has been observed for other neuronal proteins, such as NSF or v-ATPase (see main text). Among the six Cys conserved within the kinesin 13 family, integration of structural data revealed that C260 and C287 are more likely to play a functional role. In particular, a study performed for MCAK (Talapatra et al., 2015), a hamster homolog of KIF2C, showed that ATPase activity requires dimerization of the motor domain and that C260 and C287 are located at this dimeric interface. Moreover, dimerization is promoted by interaction with the C-terminal (CT) domain, and C287 is located in the vicinity of CT (Talapatra et al., 2015). Indeed, introduction of a Cys mutation in the CT domain resulted in formation of a disulfide bond with C287 (Talapatra et al., 2015). Therefore, C260 and C287 were selected for covalent docking to KIF2C. Favorable docking scores were obtained for both C260 and C287 (−35.3 and −34.1 a.u., respectively). We surmise that attachment of ACR to either C260 or C287 in KIF2C could interfere with motor domain dimerization and/or CT domain binding.
For KIFC1/HSET, we again relied on experimental data to identify the best candidate residues among the conserved cysteines within the kinesin 14 family. In particular, a previous study identified C663 as an attachment site for covalent inhibitors (Förster et al., 2019). Therefore, we selected C663 for the covalent docking of acrylamide to KIFC1, resulting in a favorable docking score of −14.0 a.u. We speculate that binding of acrylamide to C663 could impair the ATPase activity of KIFC1/HSET based on its structural position. This cysteine is at the C-terminal end of KIFC1 and is part of the so-called α6 helix, which together with α4 forms a cleft known to bind inhibitors (Park et al., 2017). Moreover, the α4-α6 cleft is located adjacent to the P-loop (or Walker A motif) responsible for ATP binding and hydrolysis. Thus, ACR modification of C663 could allosterically trigger rearrangements of the P-loop through either helix α4 or α6.

Sex Hormone-Binding Globulin (SHBG)
Sex hormone-binding globulin is a glycoprotein that binds steroid hormones to regulate the amount of free steroid molecules in plasma. SHBG contains cysteine-rich regions essential for ligand binding. Indeed, truncated mutants of rabbit SHBG lacking the disulfide bond formed between C164 and C188 were no longer able to bind androgens (Wong et al., 2001). Moreover, these two cysteines are highly conserved among G-domain-containing proteins, such as rabbit SHBG, and have been shown to contribute to protein stability (Wong et al., 2001). Therefore, we selected C164 and C188 as potential ACR binding sites for the subsequent covalent docking calculations. Nonetheless, we note that these two cysteines are involved in a disulfide bridge, and thus prior reduction or chemical modification of the disulfide bridge would be needed for them to react with acrylamide, as discussed in section 1.4.5.
The docking results show favorable docking scores for both candidate cysteines, −23.0 a.u. for C164 and −16.6 a.u. for C188, as well as stabilizing protein-ligand interactions (see figures below). Considering the aforementioned changes in ligand binding and protein stability upon mutation or truncation of these cysteines, it is tempting to speculate that ACR modification of C164 and C188 will have a similar effect.

Topoisomerase IIa
Topoisomerases are highly conserved enzymes involved in multiple processes taking place in the cell nucleus, such as DNA replication, chromosome condensation and chromosome segregation. Acrylamide has been shown to inhibit topoisomerase II activity in nuclear extracts (Sciandrello et al., 2010). However, the molecular mechanism of such inhibition is unclear, as acrylamide does not induce DNA cleavage, unlike other sulfhydryl-reactive agents modifying topoisomerase II. Therefore, we inspected the whole protein structure to pinpoint possible Cys sites targeted by ACR and selected two candidates, C170 and C997. C170 is located near the active site of the ATPase domain and thus its modification by ACR could hinder ATP binding, as shown for other ATPases in this study. In contrast, C997 belongs to the Toprim (topoisomerase-primase) domain, the catalytic domain involved in DNA strand cleavage and religation. The location of C997 at the protein-DNA interface suggests that its modification by ACR could interfere with Toprim catalytic activity.
Covalent docking to C170 showed that this cysteine is too buried inside the protein to allow formation of the adduct. In contrast, docking to C997 showed a favorable docking score of −8.4 a.u. for the top (best scored) cluster, thus supporting this cysteine as the primary site of ACR modification. This is in line with acrylamide being described as a catalytic inhibitor that reduces the amount of catalytically competent enzyme sensitive to the topoisomerase II poison etoposide (Sciandrello et al., 2010).
Table indicating the target protein, reactive Cys and docking cluster considered, together with the list of detected protein-ligand HBs and their respective frequency.

Analysis of hydrophobic interactions
Besides the H-bonds described in the main text (see section 3.4), we also analyzed the hydrophobic interactions between the two methylene carbon atoms of the ACR adduct and nearby protein residues, as explained in the Methods (section 2.4 in the main text). The results are presented in Table S5. No clear preference for aliphatic/aromatic or small/branched amino acids was identified. We speculate that this is due to the small size and flexibility of the ACR adduct, which allow the ligand to form diverse (and non-directional) hydrophobic interactions.

Dependence of the covalent docking results on the input structure
As mentioned in the main text (see section 4), the docking approach is fully flexible, so that the protein structure can adapt to the ACR covalent adduct. Nonetheless, we checked whether the results were robust with respect to the initial protein structure. As a test case, we chose GAPDH because (i) several experimental structures are available and (ii) it contains three Cys residues that have been shown experimentally to be modified by ACR (C152, C156 and C247). Besides the protein structure reported in the main text (PDB code 4WNC), we tested two more (PDB codes 1U8F and 6YND), solved at different resolutions (see Table S6). Regardless of the input structure, the primary ACR binding site C152 still exhibits a better docking score than the two Cys residues modified only at high ACR concentrations (C156 and C247). Therefore, this test case further supports that covalent docking can be used to pinpoint the most reactive Cys site within a given protein.
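The robustness check described above amounts to verifying, for every input structure, that the primary site has the most favorable score. A minimal sketch of this check is given below; the function name `primary_site_is_top` is hypothetical, and the numbers in the example are placeholders chosen for illustration, not the values reported in Table S6.

```python
def primary_site_is_top(scores_by_structure, primary):
    """For each input structure, check that `primary` has the best score.

    scores_by_structure: {structure_id: {cysteine: docking score}}.
    Lower (more negative) scores are taken as more favorable.
    Returns {structure_id: True or False}.
    """
    return {
        structure: min(scores, key=scores.get) == primary
        for structure, scores in scores_by_structure.items()
    }


# Placeholder scores for illustration only (NOT the values of Table S6).
example = {
    "structure_1": {"C152": -3.0, "C156": -2.0, "C247": -1.0},
    "structure_2": {"C152": -2.5, "C156": -2.0, "C247": -0.5},
}
check = primary_site_is_top(example, primary="C152")
```

A result that is True for every structure corresponds to the outcome reported here for GAPDH.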